FPGA based accelerator for parallel DBSCAN algorithm

Authors

  • Yue Qi
  • Wang Qin
  • Shaobo Shi
  • Qi Yue
  • Qin Wang
Abstract

Data mining plays a vital role in many application fields. One important task in data mining is clustering, the process of grouping data with high similarity. Density-based clustering is an effective method that can find clusters of arbitrary shape in feature space, and DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a fundamental algorithm of this kind. With the tremendous increase in data sizes, the processing time of clustering algorithms can reach several hours or more. In recent years, FPGAs have delivered notable acceleration in data mining applications. In this paper, we study the parallel DBSCAN algorithm and map it to an FPGA using an architecture that exploits task-level and data-level parallelism. Experimental results show that this accelerator provides up to 86x speedup over a software implementation on a general-purpose processor and 2.9x over a software implementation on a graphics processor.
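For readers unfamiliar with DBSCAN, the following is a minimal sequential sketch in Python. It is not the accelerator design from the paper; the names `region_query`, `dbscan`, `eps`, and `min_pts` are illustrative assumptions. The neighborhood search in `region_query` consists of independent distance computations, which is the kind of work a data-parallel design could plausibly distribute.

```python
from math import dist

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Plain sequential DBSCAN; returns one cluster label per point."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster
        pending = list(neighbors)
        while pending:
            j = pending.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                pending.extend(j_neighbors)  # j is also core: keep expanding
    return labels


# Hypothetical toy data: two tight groups and one isolated point.
sample = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(sample, eps=1.5, min_pts=3)
```

In this toy run the two groups receive labels 0 and 1, and the isolated point is marked as noise.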


Similar resources

An FPGA-based Geometric Biclustering Accelerator for Genes Microarray Data Analysis

This paper introduces a novel hardware architecture for accelerating the geometric biclustering (GBC) algorithm for gene microarray data analysis on an FPGA. The proposed FPGA-based accelerator provides high-throughput parallel processing and improves the speed of GBC computation by 30% compared to a pure software implementation written in C.


A study of the problems of the DBSCAN clustering algorithm and a review of the improvements proposed for it

Clustering is an important knowledge discovery technique in databases. Density-based clustering algorithms are one of the main methods for clustering in data mining. These algorithms have some special features, including independence from the shape of the clusters, high understandability, and ease of use. DBSCAN is the base algorithm for density-based clustering. DBSCAN is able to...


A Robust Density-Based Clustering Approach Using DBCURE –MapReduce Techniques

Clustering is the process of grouping similar data into the same cluster and dissimilar data into different clusters. Density-based clustering, with algorithms such as DBSCAN and OPTICS, is a useful clustering approach. The increasing volume of data and the varying size of data sets make the clustering process challenging. We therefore propose a parallel clustering framework based on MapReduce. We deve...


Experiments in Parallel Clustering with DBSCAN

We present a new result concerning the parallelisation of DBSCAN, a Data Mining algorithm for density-based spatial clustering. The overall structure of DBSCAN has been mapped to a skeleton-structured program that performs parallel exploration of each cluster. The approach is useful to improve performance on high-dimensional data, and is general w.r.t. the spatial index structure used. We report...


Improvement of density-based clustering algorithm using modifying the density definitions and input parameter

Clustering is one of the main tasks in data mining and means grouping similar samples. There is a wide variety of clustering algorithms, and density-based clustering is one category among them. Various algorithms have been proposed for this method; one of the most widely used is DBSCAN. DBSCAN can identify clusters of different shapes in the dataset and automatically i...




Publication date: 2014